Abstract. Let $\mathcal{P}(\Sigma^*)$ be the semiring of languages, and consider its subset
$\mathcal{P}(\Sigma)$. In this paper we define the language recognized by a weighted
automaton over $\mathcal{P}(\Sigma)$ and a one-letter alphabet. Similarly, we introduce the
notion of language recognition by linear recurrence equations with coefficients in
$\mathcal{P}(\Sigma)$. As we will see, these two definitions coincide. We prove that the
languages recognized by linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$
are precisely the regular languages, thus providing an alternative way to present
these languages. A remarkable consequence of this kind of recognition is that it
induces a partition of the language into its cross-sections, where the nth
cross-section contains all the words of length n in the language. Finally, we show
how to use linear recurrence equations to calculate the density function of a regular
language, which assigns to every n the number of words of length n in the language.
We also show how to count the number of successful paths of a weighted automaton.

Keywords: cross-section of a language, density of a language, language recognition,
recurrence equations, semirings, weighted automata



1. Introduction

Weighted automata are powerful finite-state machines in which every transition
carries a weight from a semiring. These automata have been studied recently in a 
wide range of settings, from very applied fields, like natural language and speech- 
processing (see [TTJ [TUl [5]), to more theoretical ones, like logic (see [5]). In our 
current research, we are particularly interested in the applications of weighted au- 
tomata to formal language theory. 

A finite automaton ([5J |7]) can be regarded as a particular type of weighted 
automaton, by letting the weights come from the Boolean semiring (i.e., the weights 
are either 0 or 1). Thus, the class of weighted automata contains the class of finite
automata. Kleene's Theorem states that finite automata recognize the regular 
languages. Hence, it is no surprise that weighted automata can be used to recognize 
a class of languages that contains the class of the regular languages. In particular, 
it can be shown that weighted automata can be used to recognize context-free 
languages (see [4]). 
In our work we are interested in weighted automata over a one-letter alphabet.
We refer to these automata as counting automata, since they can be used as count- 
ing devices, with applications in combinatorics and enumeration (see |12[ I13[ [Tl] ). 
among others. We start by recalling the definitions of a semiring and a formal 
power series (Section 2). These notions provide the setting we need to associate 
a linear recurrence equation to each state of a counting automaton. In fact, we 
will see that a counting automaton over a semiring K generates a system of linear 
recurrence equations with coefficients in K (Section 3). 

Given our interest in the applications of weighted automata to formal language 
theory, we explore counting automata, and recurrence equations, over the semiring 
of languages, $\mathcal{P}(\Sigma^*)$ (Section 4). Specifically, we consider its subset $\mathcal{P}(\Sigma)$. We
define the language recognized by a counting automaton over $\mathcal{P}(\Sigma)$, and introduce
the idea of language recognition by linear recurrence equations with coefficients in
$\mathcal{P}(\Sigma)$. We will see that these two types of language recognition are equivalent. A
consequence of recognizing a language this way is that we obtain a partition of 
the language into its cross-sections, where the nth cross-section contains all the 
words of length n in the language (see OH])- It is important to notice that this is 
the case because the weights of the automata and the coefficients of the recurrence 
equations come from $\mathcal{P}(\Sigma)$. We will show that the languages recognized by counting
automata over $\mathcal{P}(\Sigma)$, and by linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$,
are closed under union, concatenation, and the Kleene star. We then prove that a
language recognized by a system of linear recurrence equations with coefficients in
$\mathcal{P}(\Sigma)$ is regular, and that every regular language is recognized by a system of linear
recurrence equations with coefficients in $\mathcal{P}(\Sigma)$. This result provides a novel way to present this important class
of languages. 

We conclude this paper by showing how to use linear recurrence equations to 
count, for every n, the number of words of length n in a regular language (Section 
5). That is, we show how to calculate the density function of the language (see [15]).
We will see that the number of words of length n in a language is closely related to 
the number of successful paths of length n in the counting automaton recognizing 
the language. Thus, we start by counting the number of successful paths of any 
given length in an automaton. We do this by constructing an automaton that counts 
the number of successful paths of another automaton. We refer to this machine as a 
path-counting automaton, and we use it to construct the self-counting automaton, 
which we will define as a machine with the ability to count its own successful paths. 
Therefore, for every n, we have a way to (i) generate all the words of length n and 
to (ii) count the number of words of length n in a regular language.



2. Preliminaries: Semirings and Formal Power Series 

A monoid is a nonempty set on which we define an associative operation, and 
in which there is an identity element. Using this, we can define a semiring. 

Definition 2.1. A semiring $K = (K, +, \cdot, 0, 1)$ is a set K satisfying

(1) $(K, +, 0)$ is a commutative monoid with identity element 0

(2) $(K, \cdot, 1)$ is a monoid with identity element 1

(3) for all $a, b, c$ in K, $a \cdot (b + c) = a \cdot b + a \cdot c$ and $(a + b) \cdot c = a \cdot c + b \cdot c$

(4) for all $a$ in K, $0 \cdot a = a \cdot 0 = 0$
From the definition we can see that every ring with unity is a semiring. (For
example, the ring of the real numbers is a semiring.) Some nontrivial examples of
semirings are the semiring of natural numbers $\mathbb{N} = (\mathbb{N}, +, \cdot, 0, 1)$,
the Boolean semiring $\mathbb{B} = (\{0, 1\}, \lor, \land, 0, 1)$, and the semiring of languages
$\mathcal{P}(\Sigma^*) = (\mathcal{P}(\Sigma^*), \cup, \cdot, \emptyset, \{\varepsilon\})$, where $\Sigma$ is a finite alphabet, $\Sigma^*$ is the set of all words
of finite length over $\Sigma$ ($\varepsilon$ denotes the empty word), and $\mathcal{P}(\Sigma^*)$ is the power set of
$\Sigma^*$, known as the set of languages over $\Sigma$. It is not difficult to see that if $K_1$ and
$K_2$ are two semirings, then their direct product $K_1 \times K_2$ is also a semiring.
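As an illustration of these examples, here is a minimal Python sketch. The names `Semiring`, `NAT`, `BOOL`, `LANG`, `concat`, and `product` are hypothetical and chosen only for this example, and languages are restricted to finite sets of strings so that they can be stored explicitly.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """A semiring (K, +, ., 0, 1) given by its two operations and identities."""
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]
    zero: Any
    one: Any


# Natural numbers (N, +, ., 0, 1).
NAT = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0, 1)

# Boolean semiring ({0, 1}, or, and, 0, 1), with False/True standing for 0/1.
BOOL = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)


def concat(L1, L2):
    """Concatenation of two finite languages (sets of strings)."""
    return frozenset(u + v for u in L1 for v in L2)


# Languages with union and concatenation; "" plays the role of the empty word.
# Assumption of this sketch: only finite languages are represented.
LANG = Semiring(lambda a, b: a | b, concat, frozenset(), frozenset({""}))


def product(K1, K2):
    """Direct product K1 x K2 with componentwise operations."""
    return Semiring(
        lambda a, b: (K1.add(a[0], b[0]), K2.add(a[1], b[1])),
        lambda a, b: (K1.mul(a[0], b[0]), K2.mul(a[1], b[1])),
        (K1.zero, K2.zero),
        (K1.one, K2.one),
    )
```

For instance, `product(LANG, NAT)` gives a semiring of pairs (language, natural number) in which both coordinates are updated independently.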

Definition 2.2. Let A be a finite alphabet and K a semiring. We can define a
map $s : A^* \to K$, assigning to every word $w \in A^*$ an element $c \in K$. Such a map
is known as a formal power series.

We call $s(w) = c$ the coefficient of w, or the weight of w. Of course, these
coefficients or weights have different interpretations, depending on the particular
semiring K. The set of all formal power series $s : A^* \to K$ is usually denoted by
$K\langle\langle A^* \rangle\rangle$.

For example, if A is a finite alphabet and $K = \mathbb{B}$, then notice that for $w \in A^*$,
$s(w)$ is either 0 or 1 (false or true, respectively). Hence, a formal power series
$s \in \mathbb{B}\langle\langle A^* \rangle\rangle$ rejects or accepts a word $w \in A^*$.

We have seen that, given a semiring K (and a finite alphabet A), we can define
the set of formal power series $K\langle\langle A^* \rangle\rangle$. In turn, the set of formal power series
can be made into a semiring in the following way ([8]). Addition of two series
$s_1, s_2 \in K\langle\langle A^* \rangle\rangle$ is defined by $(s_1 + s_2)(w) = s_1(w) + s_2(w)$, for all $w \in A^*$. The
series defined by $0(w) = 0$ is the identity for the addition. Multiplication of two
series $s_1, s_2$ is defined by

$(s_1 \cdot s_2)(w) = \sum_{w_1 w_2 = w} s_1(w_1) \cdot s_2(w_2)$, for all $w \in A^*$.

(This operation is known as the Cauchy product of two formal power series.) The
identity for the product is the series e defined by $e(\varepsilon) = 1$, while $e(w) = 0$ for
any other word $w \neq \varepsilon$. Hence, $(K\langle\langle A^* \rangle\rangle, +, \cdot, 0, e)$ is a semiring, the semiring of
formal power series.
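The Cauchy product can be sketched in Python as follows. This is only an illustration under two assumptions that are not in the text: the semiring is fixed to the natural numbers, and a series has finite support, so it can be stored as a dictionary from words to coefficients. The function name `cauchy_product` is hypothetical.

```python
from collections import defaultdict


def cauchy_product(s1, s2):
    """(s1 . s2)(w) = sum over all factorizations w = w1 w2 of s1(w1) * s2(w2).

    Series are dicts mapping words to natural numbers; a word that is absent
    has coefficient 0 (finite support is an assumption of this sketch).
    """
    result = defaultdict(int)
    for w1, c1 in s1.items():
        for w2, c2 in s2.items():
            # Every pair (w1, w2) contributes to the coefficient of w1 w2.
            result[w1 + w2] += c1 * c2
    return dict(result)


s = {"": 1, "a": 2}      # s(epsilon) = 1, s(a) = 2
t = {"a": 3, "aa": 1}    # t(a) = 3, t(aa) = 1
e = {"": 1}              # the identity series e for the product
```

Here the word `aa` has two factorizations with nonzero contributions, so its coefficient in `cauchy_product(s, t)` is 1·1 + 2·3 = 7.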

3. Weighted Automata and Recurrence Equations 

A convenient way to represent some formal power series is by means of weighted 
automata ([5]). 

Definition 3.1. Let $K = (K, +, \cdot, 0, 1)$ be a semiring and A a finite alphabet. A
weighted automaton $\mathcal{A}$ over K and A is a quadruple $\mathcal{A} = (Q_\mathcal{A}, \iota, \tau, \varphi)$, where
$Q_\mathcal{A}$ is a finite set of states, $\iota, \varphi : Q_\mathcal{A} \to K$ are functions defining the initial weight
and the final weight of a state, respectively, and if n is the number of states,
$\tau : A \to K^{n \times n}$ is the transition weight function. We let $\tau(x)$ be an $(n \times n)$-matrix
whose $(i,j)$-entry $\tau(x)_{ij} \in K$ gives the weight of the transition $q_i \xrightarrow{x} q_j$. If
$\tau(x)_{ij} = a$, we denote this by $q_i \xrightarrow{x|a} q_j$.

Notice that the definition of a weighted automaton does not include the notions 
of initial or final states. However, by appropriately defining $\iota$ and $\varphi$, it is possible
to equip a weighted automaton with initial and final states, as we will see later on. 

Consider now the path $P : q_0 \xrightarrow{x_1|a_1} q_1 \xrightarrow{x_2|a_2} q_2 \to \cdots \to q_{n-1} \xrightarrow{x_n|a_n} q_n$ in $\mathcal{A}$.
Denote the length of the path by $|P|$. Now define the weight of the path by

$\|P\| = \iota(q_0) \cdot a_1 \cdot a_2 \cdots a_n \cdot \varphi(q_n)$.

Notice that this path has as label the word $w = x_1 x_2 \cdots x_n \in A^*$. There might
be, of course, other paths with label $w = x_1 x_2 \cdots x_n$. We will define the weight of
the word w in $\mathcal{A}$ to be the sum of the weights $\|P\|$ over all paths P with label w.
Denote this by $\|\mathcal{A}\|(w)$, and notice that $w \mapsto \|\mathcal{A}\|(w)$ is a function
from $A^*$ to K. That is, $\|\mathcal{A}\|$ is a formal power series, so $\|\mathcal{A}\| \in K\langle\langle A^* \rangle\rangle$.

A formal power series $s \in K\langle\langle A^* \rangle\rangle$ is said to be automata recognizable if there
is a weighted automaton $\mathcal{A}$ such that $s = \|\mathcal{A}\|$. In this case we say that $\mathcal{A}$ is an
automata representation for $s \in K\langle\langle A^* \rangle\rangle$.

In our research we are interested in automata over a one-letter alphabet $A = \{x\}$.
Hence, a typical path in such an automaton is
$P : q_0 \xrightarrow{x|a_1} q_1 \xrightarrow{x|a_2} q_2 \to \cdots \to q_{n-1} \xrightarrow{x|a_n} q_n$.
Given that every transition reads the letter x, we eliminate it from
the diagram for simplicity, thus making a typical path look like
$P : q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \to \cdots \to q_{n-1} \xrightarrow{a_n} q_n$.
Since $A = \{x\}$, an arbitrary word $w \in A^*$ has the
form $w = x^n$ for some $n \in \mathbb{N}$. By definition, $\|\mathcal{A}\|(w) = \|\mathcal{A}\|(x^n)$ equals the sum of
$\|P\|$ over all paths P with label $w = x^n$. But since every transition reads the letter
x, $\|\mathcal{A}\|(x^n)$ equals the sum of $\|P\|$ over all paths P of length n. We call $\|\mathcal{A}\|$ the
behavior of the automaton $\mathcal{A}$, and define it as

$\|\mathcal{A}\|(x^n) = \sum_{P} \{\|P\| : |P| = n\}$.



Note that in this kind of automaton we are not directly accepting/rejecting words 
over some alphabet, but rather counting paths of length n, and keeping track of the 
weights of such paths. The idea of using automata as counting devices has been 
used recently with applications in combinatorics and difference equations [12], [13],
[14]. Thus, we refer to weighted automata over a one-letter alphabet as counting
automata. In our work, we further explore some of the properties of counting 
automata. 

Suppose we are interested in computing the weight of all paths of length n
(equivalently, all paths with label $x^n$) in an automaton $\mathcal{A}$, starting at a specific
state $q_0$. Then we would look at all paths of length n starting at $q_0$, compute the
weight of each of these paths, and add up these weights. We call this the behavior
of the state $q_0$ and denote it by $\|\mathcal{A}\|_{q_0}$. Then

$\|\mathcal{A}\|_{q_0}(x^n) = \sum_{P} \{\|P\| : |P| = n \text{ and } P \text{ starts at } q_0\}$.

Notice that, given any state $q_0$, $\|\mathcal{A}\|_{q_0} \in K\langle\langle \{x\}^* \rangle\rangle \cong K^{\mathbb{N}}$. Since $\|\mathcal{A}\|_{q_0}$ assigns to
every word $x^n$ an element $c_n \in K$, we can identify $\|\mathcal{A}\|_{q_0} : x^n \mapsto c_n$ with a function
$f_0 : n \mapsto c_n$. Hence, in a counting automaton, every state $q_0$ generates a function
$f_0 \in K^{\mathbb{N}}$, and thus we can identify each state with the function it generates. The
idea of associating a function to each state of an automaton goes back to classical
automata theory (see [3], [6]).

In what follows, we will assume that the initial weights of the states of an
automaton $\mathcal{A}$ are either 0 or 1. Those with a weight of 1 will be the initial states,
and those with a weight of 0 will be non-initial. Denote the set of initial states by
$I_\mathcal{A}$. For the moment, suppose that $I_\mathcal{A} = Q_\mathcal{A}$, so that every state is allowed to be
an initial state. In the next section we will assume that $I_\mathcal{A} \subset Q_\mathcal{A}$. We will allow
more freedom in the way we define the final weights of the states of an automaton.
Those states with a non-zero weight will be the final states, and those with a weight
of 0 will be non-final. We will denote the set of final states by $F_\mathcal{A}$.

Now consider an arbitrary state $f_0 \in Q_\mathcal{A}$. Let $\{f_1, \ldots, f_k\}$ be the set of states
that can be reached from $f_0$ through paths of length 1, with transition weights
$a_1, \ldots, a_k$, respectively. Suppose that the final weight of state $f_0$ is $c_0$, and that
the final weights of states $f_1, \ldots, f_k$ are $c_1, \ldots, c_k$, respectively. A graphical
representation of this is

Figure 1. Paths of length 1 starting at state $f_0$



Let $f_0 \in K^{\mathbb{N}}$ be the function generated by state $f_0$, and let $f_1, \ldots, f_k \in K^{\mathbb{N}}$
be the functions generated by states $f_1, \ldots, f_k$, respectively. It can be shown (see
[H]), that

(3.1)    $f_0(0) = c_0$,
         $f_0(n + 1) = a_1 f_1(n) + a_2 f_2(n) + \cdots + a_k f_k(n)$, for $n \ge 0$.

Notice that Eqs. (3.1) provide a recursive definition of the function generated by
each state of a counting automaton. Using this, it can be shown that a counting 
automaton A over K generates a system of linear recurrence equations. And by 
definition, these are the only equations recognized by A. 

Theorem 3.1. ([12]) Suppose that $\mathcal{A}$ is a counting automaton over a semiring K.

[Diagram of $\mathcal{A}$, with transition weights $a_{1k}, \ldots, a_{k1}$]

The functions $f_1, f_2, \ldots, f_k \in K^{\mathbb{N}}$ generated by $\mathcal{A}$ satisfy the following system of
linear recurrence equations.

(3.2)    $f_i(n + 1) = \sum_{j=1}^{k} a_{ij} f_j(n)$,  $f_i(0) = c_i$,  $1 \le i \le k$.

Conversely, given this system of linear recurrence equations, the counting
automaton recognizing it is precisely $\mathcal{A}$.
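The correspondence in Theorem 3.1 can be checked numerically on small cases. The following Python sketch fixes the semiring to the natural numbers (an assumption made for simplicity, since multiplication is then commutative) and compares the values produced by the recurrence system with a brute-force sum over paths. The names `iterate_system` and `behavior_by_paths` are hypothetical.

```python
from itertools import product


def iterate_system(a, c, n):
    """Return [f_1(n), ..., f_k(n)] for f_i(n+1) = sum_j a[i][j] * f_j(n), f_i(0) = c[i]."""
    k = len(c)
    f = list(c)
    for _ in range(n):
        f = [sum(a[i][j] * f[j] for j in range(k)) for i in range(k)]
    return f


def behavior_by_paths(a, c, start, n):
    """Sum of the weights of all paths of length n that start at state `start`.

    The weight of a path is the product of its transition weights times the
    final weight c[q] of its last state q (initial weight taken to be 1).
    """
    k = len(c)
    total = 0
    for rest in product(range(k), repeat=n):
        path = (start,) + rest
        weight = c[path[-1]]
        for p, q in zip(path, path[1:]):
            weight *= a[p][q]
        total += weight
    return total
```

Paths that use a transition of weight 0 contribute 0 to the sum, so enumerating all state sequences is equivalent to enumerating only the paths that exist in the diagram.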

Example 3.1. Higher-Degree Systems

Consider the following system of linear recurrence equations over an arbitrary
semiring K.

$f_1(n + 4) = a_{12} f_2(n)$,                    $f_1(0) = c_1$
$f_2(n + 1) = a_{21} f_1(n) + a_{22} f_2(n)$,    $f_2(0) = c_2$

Note that the degree of $f_1$ is 4. Theorem 3.1 guarantees that we can build an
automaton recognizing equations of degree 1. In order to use this result, we need
to introduce additional functions that act as intermediate states. The functions we
need can be defined as follows.

$g_1(n + 1) = f_1(n + 2)$,    $g_1(0) = f_1(1) = d_1$
$g_2(n + 1) = f_1(n + 3)$,    $g_2(0) = f_1(2) = d_2$
$g_3(n + 1) = f_1(n + 4)$,    $g_3(0) = f_1(3) = d_3$

Using these auxiliary functions, we can rewrite the original system of equations
as a system of equations of degree 1.

$f_1(n + 1) = g_1(n)$
$g_1(n + 1) = g_2(n)$
$g_2(n + 1) = g_3(n)$
$g_3(n + 1) = a_{12} f_2(n)$
$f_2(n + 1) = a_{21} f_1(n) + a_{22} f_2(n)$

Now we can use Theorem 3.1 to build the automaton that recognizes the given
system of linear recurrence equations.
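As a sanity check on this reduction, the sketch below evaluates both the degree-4 system and the rewritten degree-1 system over the natural numbers and compares the resulting values of the first function. The function names `original` and `reduced`, and the choice of natural-number coefficients, are assumptions made for this illustration.

```python
def original(a12, a21, a22, c1, c2, d, n_max):
    """f_1(n+4) = a12*f_2(n), f_2(n+1) = a21*f_1(n) + a22*f_2(n).

    d = (d1, d2, d3) supplies the initial values f_1(1), f_1(2), f_1(3).
    Returns [f_1(0), ..., f_1(n_max)].
    """
    f1, f2 = [c1, *d], [c2]
    for n in range(n_max):
        f2.append(a21 * f1[n] + a22 * f2[n])   # f_2(n + 1)
        f1.append(a12 * f2[n])                 # f_1(n + 4)
    return f1[:n_max + 1]


def reduced(a12, a21, a22, c1, c2, d, n_max):
    """The degree-1 system in the unknowns (f_1, g_1, g_2, g_3, f_2)."""
    f1, g1, g2, g3, f2 = c1, d[0], d[1], d[2], c2
    values = [f1]
    for _ in range(n_max):
        # Simultaneous update: the right-hand side uses the values at step n.
        f1, g1, g2, g3, f2 = g1, g2, g3, a12 * f2, a21 * f1 + a22 * f2
        values.append(f1)
    return values
```

With, say, `a12 = 2`, `a21 = 3`, `a22 = 1`, `c1 = c2 = 1`, and `d = (4, 5, 6)`, both functions return the same list of values for the first function.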

[Diagram with transition weights $a_{21}$ and $a_{22}$ visible]

Figure 2. Automaton recognizing the system of recurrence equations in Example 3.1



Theorem 3.1 above shows that a counting automaton over a semiring K generates
a system of linear recurrence equations with coefficients in K. In the next section
we will restrict our attention to the case where $K = \mathcal{P}(\Sigma^*)$. Specifically, we will
consider its subset $\mathcal{P}(\Sigma)$. Our goal is to define the language recognized by a counting
automaton over $\mathcal{P}(\Sigma)$, and to define what it means for a language to be recognized
by a system of linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$. One of the
implications of defining languages this way is that we obtain an immediate partition
of the language into its cross-sections. We will see that it is also possible to define
these languages through formal grammars, and we will show that these languages
are closed under union, concatenation, and the Kleene star. Using this, we will
prove that the languages recognized by linear recurrence equations with coefficients
in $\mathcal{P}(\Sigma)$ are precisely the regular languages.
4. Language Recognition, Language Partition, and the
Cross-Sections of a Regular Language 

In this section, the semiring we use for the weights of the automata and for
the coefficients of the recurrence equations is $\mathcal{P}(\Sigma^*)$. In particular, we will only
consider weights and coefficients in $\mathcal{P}(\Sigma) \subset \mathcal{P}(\Sigma^*)$.

Suppose that $\mathcal{A}$ is a counting automaton with weights in $\mathcal{P}(\Sigma)$, and assume that
the set of states of $\mathcal{A}$ is $\{f_1, f_2, \ldots, f_k\}$. As is customary when using automata for
language recognition, we will assume that there is only one initial state. Without
loss of generality, we will let the first state be the initial state. Therefore, $I_\mathcal{A} = \{f_1\}$.
We now specify the final weights of the states in $\mathcal{A}$. Non-final states were defined
as states that have a final weight of 0; in the semiring of languages, $\emptyset$. Final states
were defined as states with a non-zero final weight. Specifically, we will assume
that the final states have a final weight of 1; in the semiring of languages, $\{\varepsilon\}$.
Therefore, $f_i(0) = \emptyset$ if $f_i \notin F_\mathcal{A}$, and $f_i(0) = \{\varepsilon\}$ if $f_i \in F_\mathcal{A}$.

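Let $\mathcal{A}$ be a counting automaton with k states

Figure 3. Counting automaton $\mathcal{A}$ over $\mathcal{P}(\Sigma) \subset \mathcal{P}(\Sigma^*)$

where $I_\mathcal{A} = \{f_1\}$, $c_i = \{\varepsilon\}$ if $f_i \in F_\mathcal{A}$, $c_i = \emptyset$ if $f_i \notin F_\mathcal{A}$, and for every i and every
j, $L_{ij} \in \mathcal{P}(\Sigma)$. (If $L_{ij} = \emptyset$, we can eliminate this transition from the diagram.) We
know that $f_1$, the initial state, generates a function $f_1 : \mathbb{N} \to \mathcal{P}(\Sigma^*)$ defined by the
system

(4.1)    $f_i(n + 1) = \bigcup_{j=1}^{k} L_{ij} \cdot f_j(n)$,  $f_i(0) = c_i$,  $1 \le i \le k$.

Since $L_{ij} \in \mathcal{P}(\Sigma)$, we have that for every $n \in \mathbb{N}$, $f_1(n)$ is a language containing
words of length n. Denote $f_1(n)$ by $\mathcal{L}_n$.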

Definition 4.1. The language recognized by a counting automaton $\mathcal{A}$ over
$\mathcal{P}(\Sigma)$ is denoted by $\mathcal{L}_\mathcal{A}$ and is defined by $\mathcal{L}_\mathcal{A} = \bigcup_{n} \mathcal{L}_n$.

Thus, a word w of length n belongs to $\mathcal{L}_\mathcal{A}$ if w belongs to $\mathcal{L}_n$. We will say that
a word w of length n is recognized by the counting automaton $\mathcal{A}$ if there is a path
of length n, starting at $f_1$ and ending at a final state, whose weight contains w.
Given that the languages $\mathcal{L}_n$ are defined recursively, we can also define the
language $\bigcup_{n} \mathcal{L}_n$ in the following way.

Definition 4.2. $\mathcal{L} = \bigcup_{n} \mathcal{L}_n$ is known as the language recognized by linear
recurrence equations, since the languages $\mathcal{L}_n$ are defined via linear recurrence
equations.

Remark. A consequence of recognizing a language $\mathcal{L}$ via linear recurrence equations
with coefficients in $\mathcal{P}(\Sigma)$ is that the language is automatically partitioned into sets
$\mathcal{L}_n$, where $\mathcal{L}_n$ contains all the words of length n in $\mathcal{L}$, i.e., $\mathcal{L}_n$ is the nth
cross-section of the language.

Since the operations of the semiring of languages are union and concatenation, it
is not difficult to see that we can also define $\mathcal{L}_\mathcal{A}$ by using a grammar. We associate
a nonterminal symbol $A_i$ to every state $f_i$, except that we denote $A_1$ by S, the start
symbol, since $f_1$ is the initial state. The set of terminal symbols is $\Sigma$. Consider an
arbitrary transition $f_i \xrightarrow{L_{ij}} f_j$ and suppose that $L_{ij} = \{a_{ij,1}, a_{ij,2}, \ldots, a_{ij,m}\}$ is
nonempty. Then to this transition we associate the productions $A_i \to a_{ij,1} A_j$,
$A_i \to a_{ij,2} A_j$, ..., $A_i \to a_{ij,m} A_j$. Finally, for every state $f_i \in F_\mathcal{A}$ (so $f_i(0) = \{\varepsilon\}$) we
include a production $A_i \to \varepsilon$. Denote this grammar by $G_\mathcal{A}$. Then we will also
define $\mathcal{L}_\mathcal{A}$ by $\mathcal{L}_\mathcal{A} = L(G_\mathcal{A})$.
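The construction of the grammar from the transition weights is mechanical, and can be sketched in Python as follows. The function name `grammar_productions`, the 0-based state numbering, and the encoding of a production as a pair of strings are all assumptions of this illustration; the empty string stands for the empty word.

```python
def grammar_productions(L, final):
    """List the productions of the grammar for transition weights L and final states.

    L[i][j] is the set of letters on the transition from state i to state j.
    States are numbered 0, ..., k-1, and state 0 is the initial state, so its
    nonterminal is the start symbol S; state i > 0 gets the nonterminal A{i+1}.
    """
    def name(i):
        return "S" if i == 0 else f"A{i + 1}"

    rules = []
    for i, row in enumerate(L):
        for j, letters in enumerate(row):
            for a in sorted(letters):
                rules.append((name(i), a + name(j)))   # A_i -> a A_j
        if i in final:
            rules.append((name(i), ""))                # A_i -> epsilon
    return rules


# Two-state automaton: state 0 is initial and final, with weights
# L_12 = {a}, L_21 = {a}, L_22 = {b} (every other weight is empty).
L = [[set(), {"a"}],
     [{"a"}, {"b"}]]
rules = grammar_productions(L, final={0})
```

Every production has the form of a terminal followed by a nonterminal, or the empty word, which is why the resulting grammar is regular.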

We now present the closure properties of these languages. 

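Theorem 4.1. The languages recognized by counting automata over $\mathcal{P}(\Sigma)$ are
closed under union, concatenation, and the Kleene star.

Proof. Assume that $\mathcal{L}_\mathcal{A}$ and $\mathcal{L}_\mathcal{B}$ are the languages recognized by $\mathcal{A}$ and $\mathcal{B}$,
respectively, where $\mathcal{A}$ and $\mathcal{B}$ are counting automata over $\mathcal{P}(\Sigma)$. We will show that we can
construct counting automata over $\mathcal{P}(\Sigma)$ that recognize $\mathcal{L}_\mathcal{A} \cup \mathcal{L}_\mathcal{B}$, $\mathcal{L}_\mathcal{A} \cdot \mathcal{L}_\mathcal{B}$, and $\mathcal{L}_\mathcal{A}^*$.

Let $Q_\mathcal{A} = \{f_1, f_2, \ldots, f_k\}$ be the set of states of $\mathcal{A}$. Assume that $I_\mathcal{A} = \{f_1\}$
and that $F_\mathcal{A}$ is the (nonempty) set of final states of $\mathcal{A}$. Then we know that $\mathcal{L}_\mathcal{A}$
is recognized by an automaton like the one in Figure 3, with weights $L^\mathcal{A}_{ij}$, for
$1 \le i, j \le k$. Similarly, let $Q_\mathcal{B} = \{g_1, g_2, \ldots, g_m\}$ be the set of states of $\mathcal{B}$. Suppose
that $I_\mathcal{B} = \{g_1\}$ and that $F_\mathcal{B}$ is the (nonempty) set of final states of $\mathcal{B}$. Then $\mathcal{L}_\mathcal{B}$ is
also recognized by an automaton like the one in Figure 3, but with weights $L^\mathcal{B}_{ij}$, for
$1 \le i, j \le m$.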

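We first construct an automaton $\mathcal{C}$ recognizing $\mathcal{L}_\mathcal{A} \cup \mathcal{L}_\mathcal{B}$. Let $Q_\mathcal{C} = \{h\} \cup Q_\mathcal{A} \cup Q_\mathcal{B}$
and $I_\mathcal{C} = \{h\}$. We let $F_\mathcal{C} = \{h\} \cup F_\mathcal{A} \cup F_\mathcal{B}$ if $f_1 \in F_\mathcal{A}$ or $g_1 \in F_\mathcal{B}$. Otherwise,
$F_\mathcal{C} = F_\mathcal{A} \cup F_\mathcal{B}$. Now we just need to specify the transitions. The new automaton
$\mathcal{C}$ will contain all the transitions in $\mathcal{A}$ and in $\mathcal{B}$, plus some new ones. For every
transition $f_1 \xrightarrow{L^\mathcal{A}_{1j}} f_j$, $1 \le j \le k$, we add a new transition $h \xrightarrow{L^\mathcal{A}_{1j}} f_j$. Similarly, for
every transition $g_1 \xrightarrow{L^\mathcal{B}_{1j}} g_j$, $1 \le j \le m$, we add a new transition $h \xrightarrow{L^\mathcal{B}_{1j}} g_j$. Then $\mathcal{C}$
recognizes $\mathcal{L}_\mathcal{A} \cup \mathcal{L}_\mathcal{B}$. That is, $\mathcal{L}_\mathcal{C} = \mathcal{L}_\mathcal{A} \cup \mathcal{L}_\mathcal{B}$.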

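We now construct an automaton $\mathcal{C}$ that recognizes $\mathcal{L}_\mathcal{A} \cdot \mathcal{L}_\mathcal{B}$. Let $Q_\mathcal{C} = Q_\mathcal{A} \cup Q_\mathcal{B}$
and $I_\mathcal{C} = \{f_1\}$. We let $F_\mathcal{C} = \{f_1\} \cup F_\mathcal{B}$ if $f_1 \in F_\mathcal{A}$ and $g_1 \in F_\mathcal{B}$. Otherwise, $F_\mathcal{C} = F_\mathcal{B}$.
As for the transitions in $\mathcal{C}$, we will include all the transitions in $\mathcal{A}$ and in $\mathcal{B}$, as well
as some other ones. Assume that $F_\mathcal{A} = \{f_{j_1}, f_{j_2}, \ldots, f_{j_l}\}$. Then, for every state $f_i$,
$1 \le i \le k$, given the transitions $f_i \xrightarrow{L^\mathcal{A}_{i j_1}} f_{j_1}$, $f_i \xrightarrow{L^\mathcal{A}_{i j_2}} f_{j_2}$, ..., $f_i \xrightarrow{L^\mathcal{A}_{i j_l}} f_{j_l}$, we add a
new transition $f_i \xrightarrow{L_i} g_1$, where $L_i = L^\mathcal{A}_{i j_1} \cup L^\mathcal{A}_{i j_2} \cup \cdots \cup L^\mathcal{A}_{i j_l}$. Finally, if $f_1 \in F_\mathcal{A}$,
then for every transition $g_1 \xrightarrow{L^\mathcal{B}_{1j}} g_j$, $1 \le j \le m$, we add a transition $f_1 \xrightarrow{L^\mathcal{B}_{1j}} g_j$. By
construction, we conclude that $\mathcal{L}_\mathcal{C} = \mathcal{L}_\mathcal{A} \cdot \mathcal{L}_\mathcal{B}$.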

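We conclude the proof by constructing an automaton $\mathcal{C}$ that recognizes $\mathcal{L}_\mathcal{A}^*$. Let
$Q_\mathcal{C} = \{h\} \cup Q_\mathcal{A}$, $I_\mathcal{C} = \{h\}$, and $F_\mathcal{C} = \{h\} \cup F_\mathcal{A}$. The automaton $\mathcal{C}$ will contain all
the transitions in $\mathcal{A}$, plus some new ones. For each state $f_j$, $1 \le j \le k$, given the
transition $f_1 \xrightarrow{L^\mathcal{A}_{1j}} f_j$, we add a transition $h \xrightarrow{L^\mathcal{A}_{1j}} f_j$, and for every $f_i \in F_\mathcal{A}$, we also
add transitions $f_i \xrightarrow{L^\mathcal{A}_{1j}} f_j$ (to the state $f_j$). Then $\mathcal{L}_\mathcal{C} = \mathcal{L}_\mathcal{A}^*$. □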
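The union construction in the proof can be tried out on small automata. The Python sketch below is an illustration only: the encoding of an automaton as a triple (transitions, final states, number of states), the 0-based numbering with state 0 as the initial state, and the names `cross_section` and `union_automaton` are assumptions made here. Cross-sections are computed with system (4.1).

```python
def cross_section(trans, final, k, n):
    """Words of length n recognized from state 0, computed via system (4.1)."""
    f = [frozenset({""}) if i in final else frozenset() for i in range(k)]
    for _ in range(n):
        f = [frozenset(a + w
                       for j in range(k)
                       for a in trans.get((i, j), ())
                       for w in f[j])
             for i in range(k)]
    return f[0]


def union_automaton(A, B):
    """New initial state h = 0; states of A shift by 1, states of B by 1 + k_A."""
    (ta, fa, ka), (tb, fb, kb) = A, B
    trans = {(i + 1, j + 1): L for (i, j), L in ta.items()}
    trans.update({(i + 1 + ka, j + 1 + ka): L for (i, j), L in tb.items()})
    # h copies every transition that leaves the old initial states.
    trans.update({(0, j + 1): L for (i, j), L in ta.items() if i == 0})
    trans.update({(0, j + 1 + ka): L for (i, j), L in tb.items() if i == 0})
    final = {i + 1 for i in fa} | {i + 1 + ka for i in fb}
    if 0 in fa or 0 in fb:
        final.add(0)          # h is final when either old initial state is final
    return trans, final, 1 + ka + kb


# A recognizes (ab*a)*; B recognizes a*ba*ba*.
A = ({(0, 1): {"a"}, (1, 0): {"a"}, (1, 1): {"b"}}, {0}, 2)
B = ({(0, 0): {"a"}, (0, 1): {"b"}, (1, 1): {"a"}, (1, 2): {"b"}, (2, 2): {"a"}}, {2}, 3)
C = union_automaton(A, B)
```

For each small n, the nth cross-section of `C` can then be compared with the union of the nth cross-sections of `A` and `B`.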

By combining Theorems 3.1 and 4.1 we obtain the following.

Corollary 4.2. The languages recognized by systems of linear recurrence equations
with coefficients in $\mathcal{P}(\Sigma)$ are closed under union, concatenation and the Kleene star.

We are now ready to prove one of our main results. 

Theorem 4.3. A language recognized by a system of linear recurrence equations
with coefficients in $\mathcal{P}(\Sigma)$ is regular. Conversely, every regular language is recognized
by a system of linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$.

Proof. Suppose that $\mathcal{L}$ is a language recognized by a system of linear recurrence
equations with coefficients in $\mathcal{P}(\Sigma)$. Then we know that there is a counting
automaton $\mathcal{A}$ over $\mathcal{P}(\Sigma)$ such that $\mathcal{L} = \mathcal{L}_\mathcal{A}$. By definition, $\mathcal{L}_\mathcal{A}$ is generated by the
grammar $G_\mathcal{A}$ provided before Theorem 4.1. Note that this grammar is regular.
Therefore, $\mathcal{L}$ is a regular language.

Now we need to show that if a language is regular, then it is recognized by a
system of linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$. Recall that the
set of regular languages over an alphabet $\Sigma = \{a_1, a_2, \ldots, a_m\}$ is defined by
(i) $\emptyset$, $\{\varepsilon\}$, $\{a_1\}$, $\{a_2\}$, ..., $\{a_m\}$ are regular, and (ii) if $L_1$, $L_2$ are regular, then
$L_1 \cup L_2$, $L_1 \cdot L_2$, and $L_1^*$ are regular. It is not difficult to find systems of linear recurrence
equations with coefficients in $\mathcal{P}(\Sigma)$ that recognize the languages in (i). Note that

(4.2)    $f_1(n + 1) = \emptyset \cdot f_1(n)$,  $f_1(0) = \emptyset$

recognizes $\emptyset$,

(4.3)    $f_1(n + 1) = \emptyset \cdot f_1(n)$,  $f_1(0) = \{\varepsilon\}$

recognizes $\{\varepsilon\}$, and

(4.4)    $f_1(n + 1) = \emptyset \cdot f_1(n) \cup \{a_i\} \cdot f_2(n)$,  $f_1(0) = \emptyset$
         $f_2(n + 1) = \emptyset \cdot f_1(n) \cup \emptyset \cdot f_2(n)$,  $f_2(0) = \{\varepsilon\}$

recognizes $\{a_i\}$, for each $a_i \in \Sigma$. Finally, suppose that $L_1$ and $L_2$ are two languages
recognized by systems of linear recurrence equations. By Corollary 4.2 there are
systems of linear recurrence equations recognizing $L_1 \cup L_2$, $L_1 \cdot L_2$, and $L_1^*$. □

Theorem 4.3 shows that linear recurrence equations with coefficients in $\mathcal{P}(\Sigma)$
recognize precisely the regular languages. Hence, counting automata over $\mathcal{P}(\Sigma)$
recognize the regular languages as well. By Kleene's Theorem, regular languages
are recognized by finite automata. Thus, it is natural to translate concepts from
finite automata theory to counting automata over $\mathcal{P}(\Sigma)$. For example, we can define
what it means for a counting automaton over $\mathcal{P}(\Sigma)$ to be deterministic.
Definition 4.3. Let $\mathcal{A}$ be a counting automaton over $\mathcal{P}(\Sigma)$ and let $Q_\mathcal{A} = \{f_1, f_2,
\ldots, f_k\}$ be its set of states. We say that $\mathcal{A}$ is deterministic if for every state $f_i$,
the transition weight languages $L_{i1}, L_{i2}, \ldots, L_{ik}$ are pairwise disjoint.

Note that this is equivalent to saying that given $f_i \in Q_\mathcal{A}$ and $a \in \Sigma$, a belongs
to at most one of the transition weight languages $L_{i1}, L_{i2}, \ldots, L_{ik}$. Hence, our
definition coincides with the classical definition of a deterministic automaton (see [6]).
Notice we have not discussed how to turn a (nondeterministic) counting automaton
into a deterministic one. It should be clear, however, that the techniques from
finite automata theory that accomplish this can be applied to counting automata
over $\mathcal{P}(\Sigma)$.
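The determinism condition of Definition 4.3 is straightforward to test. In the Python sketch below, the matrix-of-sets encoding `L[i][j]` for the transition weights and the function name `is_deterministic` are assumptions of the example.

```python
from itertools import combinations


def is_deterministic(L):
    """True if, for every state i, the weights L[i][0], ..., L[i][k-1] are pairwise disjoint."""
    return all(weight_a.isdisjoint(weight_b)
               for row in L
               for weight_a, weight_b in combinations(row, 2))


# Deterministic: from each state, every letter labels at most one transition.
det = [[set(), {"a"}],
       [{"a"}, {"b"}]]

# Not deterministic: from the first state, the letter a leads to two states.
nondet = [[{"a"}, {"a", "b"}],
          [set(), set()]]
```

Here `is_deterministic(det)` holds, while `is_deterministic(nondet)` fails because the letter a appears in two weights of the first row.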

Example 4.1. Recurrence Equations and Regular Languages 

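Let $\mathcal{L} = (ab^*a)^*$ and notice that $\mathcal{L}$ is a regular language. Assume that the
alphabet is $\Sigma = \{a, b\}$. Then $\mathcal{L}$ is recognized by the automaton $\mathcal{A}$ below.

Figure 4. Counting automaton $\mathcal{A}$ recognizing $(ab^*a)^*$

Equivalently, $\mathcal{L} = (ab^*a)^*$ is recognized by the system below.

$f_1(n + 1) = \{a\} \cdot f_2(n)$,                          $f_1(0) = \{\varepsilon\}$
$f_2(n + 1) = \{a\} \cdot f_1(n) \cup \{b\} \cdot f_2(n)$,  $f_2(0) = \emptyset$

We can write $\mathcal{L}$ as $\mathcal{L} = \bigcup_{n} \mathcal{L}_n$, where $\mathcal{L}_n = f_1(n)$ is the nth cross-section of $\mathcal{L}$. If we
write the system above in matrix form, as

$\begin{bmatrix} f_1(n+1) \\ f_2(n+1) \end{bmatrix} = \begin{bmatrix} \emptyset & \{a\} \\ \{a\} & \{b\} \end{bmatrix} \begin{bmatrix} f_1(n) \\ f_2(n) \end{bmatrix}$,

then

$\begin{bmatrix} f_1(n) \\ f_2(n) \end{bmatrix} = \begin{bmatrix} \emptyset & \{a\} \\ \{a\} & \{b\} \end{bmatrix}^{n} \begin{bmatrix} \{\varepsilon\} \\ \emptyset \end{bmatrix}$.

Notice that $f_1(1) = \emptyset$, which agrees with the fact that the language $(ab^*a)^*$ has
no words of length 1. Similarly, $f_1(4) = \{aaaa, abba\}$,
and note that these are precisely the words of length 4 in $(ab^*a)^*$.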
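The cross-sections in this example can be reproduced by iterating the system directly. The Python sketch below is an illustration; the generator name `cross_sections` and the encoding of the weights as a matrix of sets are assumptions, and the empty string stands for the empty word.

```python
def cross_sections(L, c, n_max):
    """Yield the first function's values at 0, ..., n_max.

    The system iterated is f_i(n+1) = union over j of L[i][j] . f_j(n),
    with f_i(0) = c[i], where L[i][j] is a set of letters.
    """
    f = [frozenset(ci) for ci in c]
    for _ in range(n_max + 1):
        yield f[0]
        f = [frozenset(a + w
                       for j, Lij in enumerate(row)
                       for a in Lij
                       for w in f[j])
             for row in L]


L = [[set(), {"a"}],
     [{"a"}, {"b"}]]
c = [{""}, set()]        # first initial value {epsilon}, second the empty set
sections = list(cross_sections(L, c, 4))
```

The entries of `sections` are the cross-sections of lengths 0 through 4, so the last one contains exactly `aaaa` and `abba`.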

5. Path-Counting and Self-Counting Automata: Calculating the 
Density Function of a Regular Language 

In the previous section we saw how we can use weighted automata and linear 
recurrence equations to recognize a regular language. In particular, we saw how 
this type of language recognition induces a partition of the language into its cross- 
sections. And thus, for each n, we have a way of generating all the words of length n 
in the language. In this section we will see that it is possible to output not only the 
words, but also the number of words, of length n in the language. That is, we show 
how to calculate the density function of the language. Recall that a word of length 
n is recognized by a successful path with the same length. With this connection 
in mind, we start by constructing an automaton that can count, for any given n, 
the number of successful paths of length n. In order to do this, we introduce the
notion of a path-counting automaton. 

Definition 5.1. Given a counting automaton $\mathcal{A}$, its path-counting automaton
$\hat{\mathcal{A}}$ is a counting automaton over $\mathbb{N}$ that is able to count the number of successful
paths of any given length in $\mathcal{A}$.

We now show how to construct a path-counting automaton $\hat{\mathcal{A}}$. Assume that
the set of states of $\mathcal{A}$ is $\{f_1, f_2, \ldots, f_k\}$. Then we denote the set of states of $\hat{\mathcal{A}}$
by $\{\hat{f}_1, \hat{f}_2, \ldots, \hat{f}_k\}$. We let $\hat{f}_i$ be an initial state in $\hat{\mathcal{A}}$ if $f_i$ is an initial state in $\mathcal{A}$.
Suppose that $\hat{f}_i(0) = \hat{c}_i$. We let $\hat{c}_i = 1$ if $f_i$ is final. Otherwise, if $f_i$ is non-final,
we let $\hat{c}_i = 0$. Finally, consider a transition $\hat{f}_i \xrightarrow{\hat{a}_{ij}} \hat{f}_j$ in $\hat{\mathcal{A}}$, corresponding to a
transition $f_i \xrightarrow{a_{ij}} f_j$ in $\mathcal{A}$. We let $\hat{a}_{ij} = 1$ if $a_{ij} \neq 0$. Otherwise, if $a_{ij} = 0$, we let
$\hat{a}_{ij} = 0$. By Theorem 3.1 the automaton $\hat{\mathcal{A}}$ generates a system

(5.1)    $\hat{f}_i(n + 1) = \sum_{j=1}^{k} \hat{a}_{ij} \hat{f}_j(n)$,  $\hat{f}_i(0) = \hat{c}_i$,  $1 \le i \le k$.

From the way we defined $\hat{c}_i$ and $\hat{a}_{ij}$, a simple proof by induction shows that, if $f_i$ is
an initial state, $\hat{f}_i(n)$ equals the number of successful paths of length n that start
at $f_i$.
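The construction of the path-counting automaton amounts to replacing every nonzero weight by 1 and then iterating system (5.1) over the natural numbers. A minimal Python sketch follows; the names `path_counting` and `count_paths`, and the example automaton over the semiring of languages, are assumptions chosen for illustration.

```python
def path_counting(a, c, zero):
    """Replace every weight by 1 if it differs from the semiring zero, and by 0 otherwise."""
    a_hat = [[0 if w == zero else 1 for w in row] for row in a]
    c_hat = [0 if w == zero else 1 for w in c]
    return a_hat, c_hat


def count_paths(a_hat, c_hat, n):
    """Number of successful paths of length n from each state, by iterating system (5.1)."""
    f = list(c_hat)
    for _ in range(n):
        f = [sum(aij * fj for aij, fj in zip(row, f)) for row in a_hat]
    return f


# An automaton over the semiring of languages (zero is the empty set):
# three states in a chain, each with a loop, and only the last state final.
a = [[{"a"}, {"b"}, set()],
     [set(), {"a"}, {"b"}],
     [set(), set(), {"a"}]]
c = [set(), set(), {""}]
a_hat, c_hat = path_counting(a, c, zero=set())
```

With this input, `a_hat` is the 0/1 adjacency matrix of the diagram, and the first entry of `count_paths(a_hat, c_hat, n)` is the number of successful paths of length n that start at the first state.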

We now define the self-counting automaton. 

Definition 5.2. Given a counting automaton $\mathcal{A}$ and its corresponding path-counting
automaton $\hat{\mathcal{A}}$, the self-counting automaton $(\mathcal{A}, \hat{\mathcal{A}})$ is a counting automaton over
$K \times \mathbb{N}$ capable of counting its own successful paths of any given length.

Essentially, $(\mathcal{A}, \hat{\mathcal{A}})$ is an extension of the automaton $\mathcal{A}$. If the set of states of $\mathcal{A}$ is
$\{f_1, f_2, \ldots, f_k\}$, then the set of states of $(\mathcal{A}, \hat{\mathcal{A}})$ is $\{(f_1, \hat{f}_1), (f_2, \hat{f}_2), \ldots, (f_k, \hat{f}_k)\}$.
If $f_i$ is an initial state, we let $(f_i, \hat{f}_i)$ be an initial state, and for every final state
$f_j$ of $\mathcal{A}$, we let $(f_j, \hat{f}_j)$ be a final state of $(\mathcal{A}, \hat{\mathcal{A}})$. Finally, notice that the weights
are also ordered pairs. It is clear that if the weight of $f_i \to f_j$ is $a_{ij}$, then the
weight of $(f_i, \hat{f}_i) \to (f_j, \hat{f}_j)$ is $(a_{ij}, \hat{a}_{ij})$. We can think of $(\mathcal{A}, \hat{\mathcal{A}})$ as an extension
of $\mathcal{A}$, where the first coordinate keeps track of the weights of the paths traversed
(thus mimicking $\mathcal{A}$), while the second coordinate keeps track of the number of paths
traversed.

Remark. It is not difficult to see that path-counting and self-counting automata can 
be used to count the number of successful paths of a weighted automaton over any 
alphabet, not just a one-letter alphabet. Since a successful path does not depend on 
the alphabet used, simply identify all the letters in the alphabet, say $x_1, x_2, \ldots, x_m$,
with a letter x, and use counting automata. 

We now return to our discussion of formal languages. Recall that a word w of
length n is recognized by a counting automaton $\mathcal{A}$ if there is a successful path of length
n in $\mathcal{A}$ whose weight contains w. Hence, given a counting automaton $\mathcal{A}$, we would expect
the number of words of length n in $\mathcal{L}_\mathcal{A}$ to be equal to the number of successful
paths of length n in $\mathcal{A}$. That is, we would expect the density function to be the path
count $\hat{f}_1(n)$ given by Eq. (5.1).
However, these two quantities could fail to be equal. Notice that (i) a path could
recognize more than one word, and (ii) a word could be recognized by more than
one path. The next theorem shows how to correctly define the function that counts
the number of words of length n (the density of the language) and the conditions
needed on the automaton.

Theorem 5.1. Let $\mathcal{A}$ be a deterministic counting automaton over $\mathcal{P}(\Sigma)$. Then
the density function of $\mathcal{L}_\mathcal{A}$ (the language recognized by $\mathcal{A}$) can be defined via linear
recurrence equations with coefficients in $\mathbb{N}$.



Proof. Recall that the function $\hat{f}_1(n)$ in Eq. (5.1) counts the number of successful
paths of length n in $\mathcal{A}$. As we pointed out, this quantity need not be equal to the
number of words of length n in $\mathcal{L}_\mathcal{A}$. First, we need to account for case (i) above,
since a path could recognize more than one word. This is because a transition
weight may contain more than one letter from $\Sigma$. The recurrence equations that
define the density function will be just like the ones in Eq. (5.1), except that each
coefficient needs to count the number of letters in the corresponding transition
weight (instead of just being 0s or 1s). Given a regular language $\mathcal{L}$ recognized by a
system

(5.2)    $f_i(n + 1) = \bigcup_{j=1}^{k} L_{ij} \cdot f_j(n)$,  $f_i(0) = c_i$,  $1 \le i \le k$,

we define the following system of linear recurrence equations

(5.3)    $\mathbf{f}_i(n + 1) = \sum_{j=1}^{k} |L_{ij}| \, \mathbf{f}_j(n)$,  $\mathbf{f}_i(0) = |c_i|$,  $1 \le i \le k$,

where $|S|$ denotes the cardinality of a set S.

It is clear that $\mathbf{f}_1(n)$, as defined by Eq. (5.3), is greater than or equal to the number
of words of length n in $\mathcal{L}_\mathcal{A}$. Note that $\mathbf{f}_1(n)$ is strictly greater if there is a word
recognized by more than one successful path. We claim that, since $\mathcal{A}$ is deterministic,
no word of length n is recognized by more than one successful path with the same
length, and thus $\mathbf{f}_1(n)$ gives precisely the number of words of length n in $\mathcal{L}_\mathcal{A}$.
(Hence, determinism will take care of case (ii) above, where a word could be
recognized by more than one path.)

Suppose, on the contrary, that there is a word w recognized by more than one
successful path. Since $\mathcal{A}$ has only one initial state, both paths start at $f_1$, so there
is a last state $f_i$ that they share before they first diverge. At that point both paths
read the same letter of w along two different transitions leaving $f_i$, so the transition
weights leaving $f_i$ are not pairwise disjoint. This contradicts the fact that $\mathcal{A}$ is
deterministic. Thus, we conclude that the density function of $\mathcal{L}_\mathcal{A}$ is $\mathbf{f}_1(n)$. □
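The role of determinism in this proof can be observed on small examples. The Python sketch below iterates system (5.3) and compares the result with the size of the cross-section obtained from system (5.2). The function names `density` and `words` and the two example automata are assumptions made for this illustration, with state 0 as the initial state.

```python
def density(L, final, n):
    """Value at n of the first function of system (5.3).

    Coefficients are the cardinalities of the weights L[i][j]; the initial
    value of state i is 1 if i is final and 0 otherwise.
    """
    f = [1 if i in final else 0 for i in range(len(L))]
    for _ in range(n):
        f = [sum(len(Lij) * fj for Lij, fj in zip(row, f)) for row in L]
    return f[0]


def words(L, final, n):
    """The cross-section of length n, computed from system (5.2)."""
    f = [{""} if i in final else set() for i in range(len(L))]
    for _ in range(n):
        f = [{a + w for Lij, fj in zip(row, f) for a in Lij for w in fj}
             for row in L]
    return f[0]


# Deterministic, with a weight that contains two letters.
det = [[set(), {"a", "b"}],
       [{"a"}, {"b"}]]

# Not deterministic: the letter a leaves the first state along two transitions.
nondet = [[{"a"}, {"a"}],
          [set(), set()]]
```

For `det` (final state 0) the two quantities agree for every n tried, while for `nondet` (both states final) the word `a` is recognized by two paths, so the recurrence returns 2 although the cross-section of length 1 has a single word.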

Remark. The density function of a language $\mathcal{L}$ is usually denoted in the
literature by $\rho_\mathcal{L}(n)$ (see [15]). Formal languages can be classified according to their
density. For example, we say that a language has a constant, polynomial, or
exponential density if $\rho_\mathcal{L}(n) = \mathbf{f}_1(n)$ has constant, polynomial, or exponential order,
respectively.

Example 5.1. Path-Counting Automata, and a Language of Polynomial Density
Consider the regular language $\mathcal{L} = a^*ba^*ba^*$. It is easy to see that $\mathcal{L}$ is recognized
by the counting automaton $\mathcal{A}$ shown below. (Note that $\mathcal{A}$ is deterministic.)

[Diagram: states $f_1$, $f_2$, $f_3$ with final weights $\emptyset$, $\emptyset$, $\{\varepsilon\}$; each state carries a loop with weight $\{a\}$, and the transitions $f_1 \to f_2$ and $f_2 \to f_3$ have weight $\{b\}$]

Figure 5. Counting automaton $\mathcal{A}$ recognizing $a^*ba^*ba^*$




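It is not difficult to see that the system that defines the density function $\mathbf{f}_1(n)$ is
the one below.

$\mathbf{f}_1(n + 1) = \mathbf{f}_1(n) + \mathbf{f}_2(n)$,  $\mathbf{f}_1(0) = 0$
$\mathbf{f}_2(n + 1) = \mathbf{f}_2(n) + \mathbf{f}_3(n)$,  $\mathbf{f}_2(0) = 0$
$\mathbf{f}_3(n + 1) = \mathbf{f}_3(n)$,                   $\mathbf{f}_3(0) = 1$

We could write this system in matrix form, as

$\begin{bmatrix} \mathbf{f}_1(n+1) \\ \mathbf{f}_2(n+1) \\ \mathbf{f}_3(n+1) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \mathbf{f}_1(n) \\ \mathbf{f}_2(n) \\ \mathbf{f}_3(n) \end{bmatrix}$.

Then

$\begin{bmatrix} \mathbf{f}_1(n) \\ \mathbf{f}_2(n) \\ \mathbf{f}_3(n) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}^{n} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$.

Alternatively, we could obtain an explicit formula for $\mathbf{f}_1(n)$ in the following way.
Notice that $\mathbf{f}_3(n) = 1$ for $n \ge 0$, and hence $\mathbf{f}_2(n) = n$ for $n \ge 0$. Using this, we
obtain that $\mathbf{f}_1(0) = 0$ and, for $n \ge 1$, $\mathbf{f}_1(n) = \mathbf{f}_1(n - 1) + (n - 1)$. Thus, if $n \ge 1$,
$\mathbf{f}_1(n) = \frac{n(n-1)}{2}$. We conclude that the language $a^*ba^*ba^*$ contains exactly
$\frac{n(n-1)}{2}$ words of length n, for $n \ge 1$.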
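The closed form can be checked against both the recurrence and a brute-force count. The Python sketch below is an illustration: the function names are hypothetical, and the brute-force count relies on the observation that a word over {a, b} belongs to a*ba*ba* exactly when it contains two occurrences of b.

```python
from itertools import product


def density_by_recurrence(n):
    """Iterate the three-equation system with initial values (0, 0, 1)."""
    f1, f2, f3 = 0, 0, 1
    for _ in range(n):
        f1, f2, f3 = f1 + f2, f2 + f3, f3
    return f1


def density_by_enumeration(n):
    """Count the words of length n over {a, b} with exactly two letters b."""
    return sum(1 for w in product("ab", repeat=n) if w.count("b") == 2)
```

For small n, both functions agree with n(n-1)/2; for example, there are 6 words of length 4.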

Example 5.2. Path-Counting Automata, and a Language of Exponential Density
Consider again the language $\mathcal{L} = (ab^*a)^*$ from Example 4.1, recognized by the
counting automaton $\mathcal{A}$ in Figure 4. This automaton is, clearly, deterministic. It is
not difficult to see that the density function is defined by the following system.

$\mathbf{f}_1(n + 1) = \mathbf{f}_2(n)$,                   $\mathbf{f}_1(0) = 1$
$\mathbf{f}_2(n + 1) = \mathbf{f}_1(n) + \mathbf{f}_2(n)$,  $\mathbf{f}_2(0) = 0$

Notice that $\mathbf{f}_1(n + 2) = \mathbf{f}_2(n + 1) = \mathbf{f}_1(n) + \mathbf{f}_2(n) = \mathbf{f}_1(n) + \mathbf{f}_1(n + 1)$, with $\mathbf{f}_1(0) = 1$
and $\mathbf{f}_1(1) = 0$. Hence, if $F_n$ denotes the nth Fibonacci number, we have that
$\mathbf{f}_1(n) = F_{n-1}$, for $n \ge 1$. We conclude that if $n \ge 1$, the number of words of length
n in $(ab^*a)^*$ is

$\frac{\varphi^{n-1} - (1 - \varphi)^{n-1}}{\sqrt{5}}$, where $\varphi = \frac{1 + \sqrt{5}}{2}$ is the golden ratio.
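The agreement between the recurrence and the closed form can be checked numerically. In the Python sketch below, the function names are hypothetical, and the closed form is evaluated in floating point and rounded, which is an assumption that is only reliable for moderate values of n.

```python
from math import sqrt


def density_by_recurrence(n):
    """Iterate the two-equation system with initial values (1, 0)."""
    f1, f2 = 1, 0
    for _ in range(n):
        f1, f2 = f2, f1 + f2
    return f1


def density_by_closed_form(n):
    """(phi^(n-1) - (1 - phi)^(n-1)) / sqrt(5), rounded to the nearest integer."""
    phi = (1 + sqrt(5)) / 2
    return round((phi ** (n - 1) - (1 - phi) ** (n - 1)) / sqrt(5))
```

For n from 1 to 8 the recurrence gives 0, 1, 1, 2, 3, 5, 8, 13, which matches the closed form.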